Data Toolbar

Developer(s): DataTool Services
Operating system: Microsoft Windows
Type: Browser toolbar, web scraping
Website: www.datatoolbar.com

Data Toolbar is an Internet Explorer add-on for collecting catalog-style information from the web. The add-on converts structured web data into a tabular format that can be loaded into a spreadsheet or a database.[1]

Algorithm

The program implements a variation of the generalized tree matching algorithm that takes nested lists into account.[2] That is, within a given web page, the program recursively traverses the branches of the page's DOM tree to detect nested lists of data items that match the format of the specified content. This approach has several known advantages over a simple string matching algorithm.[3]
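
The recursive traversal described above can be illustrated with a small sketch. This is not Data Toolbar's implementation and not the published generalized tree matching algorithm; it swaps in a much simpler structural-signature heuristic, assuming the page is available as well-formed XML. The names signature, find_lists, to_rows and PAGE are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET


def signature(node):
    """Return a structural signature: the tag plus the child signatures.

    Text is ignored, and consecutive identical child signatures are
    collapsed so that lists of different lengths still compare equal.
    (Illustrative heuristic, not the tool's actual matching rule.)
    """
    child_sigs = []
    for child in node:
        sig = signature(child)
        if not child_sigs or child_sigs[-1] != sig:
            child_sigs.append(sig)
    return node.tag + "(" + ",".join(child_sigs) + ")"


def find_lists(node, min_items=2, path=""):
    """Recursively yield (path, items) for runs of same-shaped siblings."""
    here = f"{path}/{node.tag}"
    children = list(node)
    i = 0
    while i < len(children):
        sig = signature(children[i])
        j = i + 1
        while j < len(children) and signature(children[j]) == sig:
            j += 1
        if j - i >= min_items:
            yield here, children[i:j]
        i = j
    # Descend into every child so that nested lists are found as well.
    for child in children:
        yield from find_lists(child, min_items, here)


def to_rows(items):
    """Flatten each list item into a row holding the text of its leaf nodes."""
    return [
        [(leaf.text or "").strip() for leaf in item.iter() if len(leaf) == 0]
        for item in items
    ]


# Hypothetical catalog-style fragment used only for demonstration.
PAGE = """
<div>
  <ul>
    <li><b>Widget</b><span>9.99</span></li>
    <li><b>Gadget</b><span>19.50</span></li>
    <li><b>Gizmo</b><span>4.25</span></li>
  </ul>
</div>
"""

if __name__ == "__main__":
    root = ET.fromstring(PAGE)
    for path, items in find_lists(root):
        print(path, to_rows(items))
```

Each detected run of same-shaped siblings becomes one table, with one row per data item, which is the kind of output that could then be written to a spreadsheet or database. Matching on tree shape rather than on raw markup strings means that differences in text content or list length do not prevent items from being grouped.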

Features

Similar Tools

Sources

  1. "A guide to the mortgage banking industry's leading providers of high-tech products and services". The Journal for Mortgage Banking Professionals (Zackin Publications) 25 (2): 14. January 2011. http://issuu.com/zackinpublications/docs/sme1101_online.
  2. Alberto H. F. Laender, Berthier A. Ribeiro-Neto, Altigran S. da Silva, Juliana S. Teixeira. "A Brief Survey of Web Data Extraction Tools". ACM SIGMOD Record, Volume 31, Issue 2.
  3. Nitin Jindal, Bing Liu. "A Generalized Tree Matching Algorithm Considering Nested Lists for Web Data Extraction". Proceedings of the Tenth SIAM International Conference on Data Mining, 2010.

External links